Affinity peptides and method for purification of recombinant proteins

ABSTRACT

This invention describes a process for separating a fusion protein or polypeptide in the form of its precursor from a mixture containing said fusion protein and impurities, which comprises contacting said fusion protein with a resin containing immobilized metal ions, said fusion protein covalently operably linked directly or indirectly to an immobilized metal ion-affinity peptide, binding said fusion protein to said resin, and selectively eluting said fusion protein from said resin.

REFERENCE TO RELATED APPLICATIONS

This application is a divisional of application Ser. No. 10/460,524, filed Jun. 12, 2003, which is a non-provisional application claiming priority from provisional Application Ser. No. 60/388,059, filed Jun. 12, 2002, the content of each of which is hereby incorporated herein by reference.

FIELD OF THE INVENTION

This invention relates to affinity peptides, fusion proteins containing affinity peptides, genes coding for such proteins, expression vectors and transformed microorganisms containing such genes, and methods for the purification of the fusion proteins.

BACKGROUND OF THE INVENTION

The possibility of preparing hybrid genes by gene technology has opened up new routes for the analysis of recombinant proteins. By linking the coding gene sequence of a desired protein to the coding gene sequence of a protein fragment having a high affinity for a ligand (affinity peptide), it is possible to purify desired recombinant proteins in the form of fusion proteins in one step using the affinity peptide.

Immobilized metal affinity chromatography (IMAC), also known as metal chelate affinity chromatography (MCAC), is a specialized aspect of affinity chromatography. The principle behind IMAC lies in the fact that many transition metal ions, e.g., nickel, zinc and copper, can coordinate to the amino acids histidine, cysteine, and tryptophan via electron donor groups on the amino acid side chains. To utilize this interaction for chromatographic purposes, the metal ion is typically immobilized onto an insoluble support. This can be done by attaching a chelating group to the chromatographic matrix. Most importantly, to be useful, the metal of choice must have a higher affinity for the matrix than for the compounds to be purified.

In U.S. Pat. No. 4,569,794, Smith et al. disclose the preparation of a fusion protein containing a metal ion-affinity peptide linker and a biologically active polypeptide, expressing the fusion protein, and purifying it using immobilized metal ion chromatography. Because essentially any biologically active polypeptide could be used, this approach enabled the convenient expression and purification of essentially any biologically active polypeptide by immobilized metal ion chromatography.

In U.S. Pat. Nos. 5,310,663 and 5,284,933, Dobeli et al. disclose a process for separating a biologically active polypeptide from impurities by producing the desired polypeptide as a fusion protein containing a metal ion-affinity peptide linker comprising 2 to 6 adjacent histidine residues. Although Dobeli et al.'s metal ion-affinity peptide provides greater metal affinity relative to certain of the sequences disclosed by Smith et al., there is some cautionary evidence that proteins containing His-tags may differ from their wild-type counterparts in dimerization/oligomerization properties. For example, Wu and Filutowicz present evidence that the biochemical properties of the pi(30.5) protein of plasmid R6K, a DNA binding protein, were fundamentally altered due to the presence of an N-terminal 6× His-tag. Wu, J. and Filutowicz, M., Acta Biochim. Pol., 46:591-599, 1999. In addition, Rodriguez-Viciana et al. stated that V12 Ras proteins expressed as histidine-tagged fusion proteins exhibited poor biological activity. Rodriguez-Viciana, P., et al., Cell, 89:457-67, 1997.

SUMMARY OF THE INVENTION

One aspect of the present invention is a peptide which is relatively hydrophilic, is capable of exhibiting appropriate biological activity, and has a relatively high affinity for coordinating metals. Advantageously, this metal ion-affinity peptide may be incorporated into a fusion protein to enable ready purification of the fusion protein from aqueous solutions by immobilized metal affinity chromatography. In addition to the metal ion-affinity peptide, the fusion protein typically comprises a protein or polypeptide of interest, covalently linked, directly or indirectly, to the metal ion-affinity peptide.

Briefly, therefore, the present invention is directed to a peptide represented by the formula R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)-Sp₂-R₂, wherein (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) is a metal ion-affinity peptide, R₁ is hydrogen, a polypeptide, protein or protein fragment, Sp₁ is a covalent bond or a spacer comprising at least one amino acid residue, R₂ is hydrogen, a polypeptide, protein or protein fragment, Sp₂ is a covalent bond or a spacer comprising at least one amino acid residue, Z₁ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val; and Z₂ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val.

The present invention is further directed to a process for separating a recombinant protein or polypeptide from a liquid mixture wherein the recombinant protein or polypeptide comprises a metal ion-affinity peptide having the sequence His-Z₁-His-Arg-His-Z₂-His (SEQ ID NO: 24) and Z₁ and Z₂ are as previously defined. In the process, the mixture is combined with a solid support having immobilized metal ions to bind the recombinant protein or polypeptide, and the fusion protein is then eluted from the solid support.

The present invention is further directed to vectors and host cells for recombinant expression of the nucleic acid molecules described herein, as well as methods of making such vectors and host cells and for using them for production of the polypeptides or peptides of the present invention by recombinant techniques.

The present invention is further directed to a kit for the expression and/or separation of the recombinant proteins or polypeptides from a mixture wherein the recombinant proteins or polypeptides contain the sequence R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)-Sp₂-R₂, and R₁, R₂, Sp₁, Sp₂, Z₁ and Z₂ are as previously defined. The kit may comprise, in separate containers, the nucleic acid components to be assembled into a vector encoding for a fusion protein comprising a protein or polypeptide covalently operably linked directly or indirectly to an immobilized metal ion-affinity peptide. In addition, or alternatively, the kit may be comprised of one or more of the following: buffers, enzymes, a chromatography column comprising a resin containing immobilized metal ions and an instructional brochure explaining how to use the kit.

Other objects and advantages of the present invention will become apparent as the detailed description of the invention proceeds.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention generally relates to the expression and purification of recombinant polypeptides, proteins or protein fragments containing a metal ion-affinity peptide. In addition to the metal ion-affinity peptide, the recombinant polypeptides and proteins will typically also contain a target polypeptide, protein or fragment thereof covalently linked to the metal ion-affinity peptide. In one embodiment, the target polypeptide, protein or protein fragment is a biologically active protein or protein fragment. Advantageously, the metal ion-affinity peptide enables the recombinant polypeptides and proteins to be readily purified from a liquid sample by means of metal ion affinity chromatography.

The fusion proteins of this invention are prepared by recombinant DNA methodology. In accordance with the present invention, a gene sequence coding for a desired protein is isolated, synthesized or otherwise obtained and operably linked to a DNA sequence coding for the metal ion-affinity peptide. The hybrid gene containing the gene for a desired protein operably linked to a DNA sequence encoding the metal ion-affinity peptide is referred to as a chimeric gene.
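As a rough illustration of how a DNA sequence encoding the metal ion-affinity peptide could be joined in frame to a sequence encoding a target, the following sketch back-translates a fused amino acid sequence. The codon table is an assumption made for illustration only (one arbitrary codon per residue from the standard genetic code); the specification does not prescribe codon usage. The function names and the example target sequence are hypothetical.

```python
# Hypothetical codon choice: one arbitrary codon per residue, taken from the
# standard genetic code. Real constructs would normally be codon-optimized
# for the chosen host; nothing here is prescribed by the specification.
CODON = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def back_translate(peptide):
    """Return one possible DNA coding sequence for a one-letter peptide."""
    return "".join(CODON[residue] for residue in peptide)


def fused_coding_sequence(tag, target, tag_at_n_terminus=True):
    """Back-translate tag and target as a single in-frame coding sequence."""
    protein = tag + target if tag_at_n_terminus else target + tag
    return back_translate(protein)


# Example: His-Asn-His-Arg-His-Lys-His (Z1 = Asn, Z2 = Lys) fused to a
# made-up four-residue target.
tag_dna = back_translate("HNHRHKH")
fusion_dna = fused_coding_sequence("HNHRHKH", "MKTA")
```

Because both parts are translated as one continuous sequence, the reading frame is preserved across the junction, which is the practical meaning of "operably linked" in this sketch.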

In one embodiment, the metal ion-affinity peptide is covalently linked to the carboxy terminus of the target polypeptide, protein or protein fragment. In another embodiment, the metal ion-affinity peptide is covalently linked to the amino terminus of the target polypeptide, protein or protein fragment. In each of these embodiments, the metal ion-affinity peptide and the target polypeptide, protein or protein fragment may be directly attached by means of a peptide bond or, alternatively, the two may be separated by a linker. When present, the linker may provide other functionality to the recombinant polypeptide, protein or protein fragment.

The recombinant polypeptides, proteins or protein fragments of the present invention are defined by the general formula (I):

R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)-Sp₂-R₂   (I)

wherein (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) is a metal ion-affinity peptide; Z₁ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val; and Z₂ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr and Val. In addition, R₁ is hydrogen, a polypeptide, protein or protein fragment, Sp₁ is a covalent bond or a spacer comprising at least one amino acid residue, R₂ is hydrogen, a polypeptide, protein or protein fragment, Sp₂ is a covalent bond or a spacer comprising at least one amino acid residue. Thus, for example, R₁ or R₂ may comprise a target polypeptide, protein, or protein fragment which is directly or indirectly linked to the metal ion-affinity peptide.
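The definition of the metal ion-affinity peptide in formula (I) can be expressed as a simple pattern over one-letter amino acid codes. The sketch below is illustrative only: the function name is hypothetical, and it assumes sequences are given as upper-case one-letter strings.

```python
import re

# One-letter codes for the residues permitted at Z1 and Z2 in formula (I).
# Z1: Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, Val
# Z2: Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser,
#     Thr, Tyr, Val
Z1_ALLOWED = "ARNDQEIKFPSTWV"
Z2_ALLOWED = "ARNDCQEGILKMPSTYV"

# His-Z1-His-Arg-His-Z2-His as a seven-residue pattern.
MOTIF = re.compile(f"H[{Z1_ALLOWED}]HRH[{Z2_ALLOWED}]H")


def find_affinity_peptides(sequence):
    """Return (start_index, heptapeptide) for every window matching the motif.

    A sliding window is used so that overlapping occurrences are reported.
    """
    hits = []
    for start in range(len(sequence) - 6):
        window = sequence[start:start + 7]
        if MOTIF.fullmatch(window):
            hits.append((start, window))
    return hits


# Example: a short made-up sequence carrying His-Asn-His-Arg-His-Lys-His.
example_hits = find_affinity_peptides("MGHNHRHKHGS")
```

Note that a run of seven histidines does not satisfy the pattern, since His is not among the residues listed for Z₁ or Z₂ and the fourth position must be Arg.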

Metal Ion-Affinity Peptide

In one embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is an amino acid selected from the group consisting of Ala, Asn, Ile, Lys, Phe, Ser, Thr, and Val; and Z₂ is an amino acid selected from the group consisting of Ala, Asn, Gly, Lys, Ser, Thr, and Tyr; and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is an amino acid selected from the group consisting of Asn and Lys; and Z₂ is an amino acid selected from the group consisting of Gly and Lys; and R₁, R₂, Sp₁, and Sp₂ are as previously defined. For example, in one such embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I) wherein Z₁ is Asn, Z₂ is Lys and R₁, R₂, Sp₁, and Sp₂ are as previously defined. By way of further example, in another such embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I) wherein Z₁ is Lys and Z₂ is Gly. In each of these alternatives, the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is Ile, Z₂ is Asn, and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is Thr, Z₂ is Ser, and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is Ser, Z₂ is Tyr, and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is Val, Z₂ is Ala, and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is Ala, Z₂ is Lys, and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.
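The specific Z₁/Z₂ pairs named in the embodiments above can be written out as one-letter heptapeptides. The following sketch is only a convenience for reading those embodiments; the helper name and the dictionary are hypothetical, and the three-letter to one-letter mapping covers just the residues that appear in the named pairs.

```python
# Standard one-letter codes for the residues used in the named Z1/Z2 pairs.
THREE_TO_ONE = {
    "Ala": "A", "Asn": "N", "Gly": "G", "Ile": "I", "Lys": "K",
    "Ser": "S", "Thr": "T", "Tyr": "Y", "Val": "V",
}

# (Z1, Z2) pairs named individually in the embodiments above.
NAMED_PAIRS = [
    ("Asn", "Lys"), ("Lys", "Gly"), ("Ile", "Asn"), ("Thr", "Ser"),
    ("Ser", "Tyr"), ("Val", "Ala"), ("Ala", "Lys"),
]


def affinity_peptide(z1, z2):
    """Build His-Z1-His-Arg-His-Z2-His in one-letter code."""
    return "H" + THREE_TO_ONE[z1] + "HRH" + THREE_TO_ONE[z2] + "H"


named_peptides = [affinity_peptide(z1, z2) for z1, z2 in NAMED_PAIRS]
```

Each resulting string has four histidines at fixed positions and an arginine at the center, differing only at the second and sixth residues.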

In a further embodiment, R₁ may be a polypeptide which drives expression of the fusion protein and R₂ is the target polypeptide, protein or protein fragment. In this embodiment, each of Sp₁ and Sp₂ may be a covalent bond or a spacer, independently of the other. Thus, for example, R₁ may be directly fused to the metal ion-affinity peptide or separated from the metal ion-affinity peptide by a spacer independently of whether R₂ is directly fused to the metal ion-affinity peptide or separated from the metal ion-affinity peptide by a spacer; all of these combinations and permutations are contemplated. This type of arrangement is particularly useful when chimeric proteins are constructed which comprise epitopes from two portions of antigenic protein or from two different antigenic proteins. Such chimeric proteins may be useful in vaccine preparations.

In another embodiment, the recombinant polypeptides, proteins or protein fragments of the present invention comprise multiple copies of the metal ion-affinity peptide (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) wherein Z₁ and Z₂ are as previously defined. In this embodiment, the additional copies of the metal affinity peptide may occur in either or both of the spacer domains (Sp₁ and Sp₂) or in either or both of the other domains (R₁ and R₂) of the recombinant polypeptides, proteins or protein fragments. Thus, for example, in one embodiment a second copy of the metal ion-affinity peptide (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) wherein Z₁ and Z₂ are as previously defined is located in one of the spacer domains (Sp₁ or Sp₂) or other domains (R₁ and R₂) of the recombinant polypeptides, proteins or protein fragments. By way of further example, in another embodiment two additional copies of the metal ion-affinity peptide (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) wherein Z₁ and Z₂ are as previously defined are located in the spacer domains (Sp₁ or Sp₂) or other domains (R₁ and R₂) of the recombinant polypeptides, proteins or protein fragments. By way of further example, in another embodiment at least three additional copies of the metal ion-affinity peptide (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) wherein Z₁ and Z₂ are as previously defined are located in the spacer domains (Sp₁ or Sp₂) or other domains (R₁ and R₂) of the recombinant polypeptides, proteins or protein fragments. In each of these embodiments, the multiple copies of the metal ion-affinity peptide may be separated by one or more amino acid residues (i.e., a spacer) as described herein. Alternatively, in each of these embodiments the multiple copies of the metal ion-affinity peptide may be directly linked to each other without any intervening amino acid residues. Thus, for example, in one such embodiment the recombinant polypeptides, proteins or protein fragments of the present invention may be defined by the general formula (II):

R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)_(t)-Sp₂-R₂   (II)

wherein (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) is a metal ion-affinity peptide; t is at least 2 and R₁, R₂, Z₁, Z₂, Sp₁ and Sp₂ are as previously defined. By way of further example, in one such embodiment the recombinant polypeptides, proteins or protein fragments of the present invention may be defined by the general formula (III):

R₁-Sp₁-[(His-Z₁-His-Arg-His-Z₂-His)-Sp₂]_(t)-R₂   (III)

wherein (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) is a metal ion-affinity peptide; t is at least 2 and R₁, R₂, Z₁, Z₂, Sp₁ and Sp₂ are as previously defined; in addition, each Sp₂ of the recombinant polypeptides, proteins or protein fragments corresponding to general formula (III) may be the same or different.
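The difference between general formulae (II) and (III) is easiest to see by assembling both as strings. In this sketch, which is illustrative only, an empty string stands in for a covalent bond (for Sp₁ or Sp₂) or for hydrogen (for R₁ or R₂); the function names and the example residues chosen for R₁, R₂ and the spacers are hypothetical.

```python
def formula_ii(r1, sp1, peptide, t, sp2, r2):
    """R1-Sp1-(peptide)t-Sp2-R2: t directly linked copies, then one Sp2."""
    if t < 2:
        raise ValueError("t must be at least 2")
    return r1 + sp1 + peptide * t + sp2 + r2


def formula_iii(r1, sp1, peptide, sp2_list, r2):
    """R1-Sp1-[(peptide)-Sp2]t-R2: each copy is followed by its own Sp2.

    sp2_list holds one spacer per copy, so t == len(sp2_list); the spacers
    may be the same or different, as the text allows.
    """
    if len(sp2_list) < 2:
        raise ValueError("t must be at least 2")
    repeats = "".join(peptide + sp2 for sp2 in sp2_list)
    return r1 + sp1 + repeats + r2


# Example with His-Asn-His-Arg-His-Lys-His and made-up flanking residues.
tandem = formula_ii("", "", "HNHRHKH", 2, "GS", "MKT")
spaced = formula_iii("M", "", "HNHRHKH", ["GG", "S"], "AAA")
```

In formula (II) the copies abut one another, whereas in formula (III) a spacer (possibly a bare covalent bond) follows every copy.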

Target Polypeptide, Protein or Protein Fragment

The target polypeptide, protein or protein fragment may be composed of any proteinaceous substance that can be expressed in transformed host cells. Accordingly, the present invention may be beneficially employed to produce substantially any prokaryotic or eukaryotic, simple or conjugated, protein that can be expressed by a vector in a transformed host cell. For example, the target protein may be

a) an enzyme, whether oxidoreductase, transferase, hydrolase, lyase, isomerase or ligase;
b) a storage protein, such as ferritin or ovalbumin or a transport protein, such as hemoglobin, serum albumin or ceruloplasmin;
c) a protein that functions in contractile and motile systems such as actin or myosin;
d) any of a class of proteins that serve a protective or defense function, such as the blood protein fibrinogen or a binding protein, such as antibodies or immunoglobulins that bind to and thus neutralize antigens;
e) a hormone such as human Growth Hormone, somatostatin, prolactin, estrone, progesterone, melanocyte, thyrotropin, calcitonin, gonadotropin and insulin;
f) a hormone involved in the immune system, such as interleukin-1, interleukin-2, colony stimulating factor, macrophage-activating factor and interferon;
g) a toxic protein, such as ricin from castor bean or gossypin from cotton linseed;
h) a protein that serves as structural elements such as collagen, elastin, alpha-keratin, glyco-proteins, viral proteins and muco-proteins; or
i) a synthetic protein, defined generally as any sequence of amino acids not occurring in nature.

In general, the target polypeptide, protein or protein fragment may be a constituent of the R₁ and R₂ moieties of the recombinant polypeptides, proteins or protein fragments corresponding to general formulae (I), (II) and (III).

Genes coding for the various types of protein molecules identified above may be obtained from a variety of prokaryotic or eukaryotic sources, such as plant or animal cells or bacterial cells. The genes can be isolated from the chromosome material of these cells or from plasmids of prokaryotic cells by employing standard, well-known techniques. A variety of naturally occurring and synthesized plasmids having genes coding for many different protein molecules are now commercially available from a variety of sources. The desired DNA also can be produced from mRNA by using the enzyme reverse transcriptase. This enzyme permits the synthesis of DNA from an RNA template.

In one embodiment, R₁ may be a protein which enhances expression and R₂ is the target polypeptide, protein, or protein fragment. It is well known that the presence of some proteins in a cell results in expression of genes. If a chimeric protein contains an active portion of the protein which prompts or enhances expression of the gene encoding it, greater quantities of the protein may be expressed than if it were not present.

Linker and Other Optional Elements

In one embodiment, the recombinant polypeptide, protein or protein fragment includes a spacer (Sp₁ or Sp₂) between the metal ion-affinity polypeptide and the target polypeptide, protein or protein fragment. If present, the spacer may simply comprise one or more, e.g., three to ten amino acid residues, separating the metal ion-affinity peptide from the target polypeptide, protein or protein fragment. Alternatively, the spacer may comprise a sequence which imparts other functionality, such as a proteolytic cleavage site, a fusion protein, a secretion sequence (e.g. OmpA or OmpT for E. coli, preprotrypsin for mammalian cells, α-factor for yeast, and melittin for insect cells), a leader sequence for cellular targeting, antibody epitopes, or IRES (internal ribosomal entry sequences) sequences.

In one embodiment, the spacer is selected from among hydrophilic amino acids to increase the hydrophilic character of the recombinant polypeptide, protein or protein fragment. Alternatively, the amino acid(s) of the spacer domain may be selected to impart a desired folding to the recombinant polypeptide, protein or protein fragment thereby increasing accessibility to one or more regions of the molecule. For example, the spacer domain may comprise glycine residues which result in a protein folding conformation which allows for improved accessibility to antibodies.

In another embodiment, the spacer comprises a cleavage site which consists of a unique amino acid sequence cleavable by use of a sequence-specific proteolytic agent. Such a site would enable the metal ion-affinity polypeptide to be readily cleaved from the target polypeptide, protein or protein fragment by digestion with a proteolytic agent specific for the amino acids of the cleavage site. Alternatively, the metal ion-affinity peptide may be removed from the desired protein by chemical cleavage using methods known to the art.

When present, the cleavable site may be located at the amino or carboxy terminus of the target peptide. Preferably, the cleavable site is immediately adjacent to the desired protein to enable separation of the desired protein from the metal ion-affinity peptide. This cleavable site preferably does not appear in the desired protein. In one embodiment, the cleavable site is located at the amino terminus of the desired protein. If the cleavable site is located at the amino terminus of the desired protein and if there are remaining extraneous amino acids on the desired protein after cleavage with the proteolytic agent, an endopeptidase such as trypsin, clostripain or furin may be utilized to remove these remaining amino acids, thus resulting in a highly purified desired protein. Further examples of proteolytic enzymatic agents useful for cleavage are papain, pepsin, plasmin, thrombin, enterokinase, and the like. Each effects cleavage at a particular amino acid sequence which it recognizes.

Digestion with a proteolytic agent may occur while the fusion protein is still bound to the affinity resin or alternatively, the fusion protein may be eluted from the affinity resin and then digested with the proteolytic agent in order to further purify the desired protein. Preferably, the amino acid sequence of the proteolytic cleavage site is unique, thus minimizing the possibility that the proteolytic agent will cleave the desired protein. In one embodiment, the cleavable site comprises amino acids for an enterokinase, thrombin or a Factor Xa cleavage site.

Enterokinase recognizes several sequences: Asp-Lys; Asp-Asp-Lys; Asp-Asp-Asp-Lys (SEQ ID NO: 25); and Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 26). The only known natural occurrence of Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 26) is in the protein trypsinogen which is a natural substrate for bovine enterokinase and some yeast proteins. As such, by interposing a fragment containing the amino acid sequence Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 26) as a cleavable site between the metal ion-affinity polypeptide and the amino terminus of the target polypeptide, protein or protein fragment, the metal ion-affinity polypeptide can be liberated from the desired protein by use of bovine enterokinase with very little likelihood that this enzyme will cleave any portion of the desired protein itself.

Thrombin cleaves on the carboxy-terminal side of arginine in the following sequence: Leu-Val-Pro-Arg-Gly-X (SEQ ID NO: 27), where X is a non-acidic amino acid. Factor Xa protease (i.e., the activated form of Factor X) cleaves after the Arg in the following sequences: Ile-Glu-Gly-Arg-X (SEQ ID NO: 28), Ile-Asp-Gly-Arg-X (SEQ ID NO: 29), and Ala-Glu-Gly-Arg-X (SEQ ID NO: 30), where X is any amino acid except proline or arginine. A fusion protein comprising the 31 amino-terminal residues of the cII protein, a Factor Xa cleavage site and human β-globin was shown to be cleaved by Factor Xa and generate authentic β-globin. A limitation of the Factor Xa-based fusion systems is the fact that Factor Xa has been reported to cleave at arginine residues that are not present within the Factor Xa recognition sequence. Lauritzen, C. et al., Protein Expr. and Purif., 5-6:372-378 (1991).
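Because the cleavage site should preferably not occur within the desired protein, it can be useful to scan a candidate fusion sequence for the recognition sequences just described. The sketch below encodes only the canonical patterns given above (the full Asp-Asp-Asp-Asp-Lys site for enterokinase, Leu-Val-Pro-Arg-Gly-X for thrombin, and the three listed Factor Xa sequences); it does not model the shorter enterokinase sites or the off-target Factor Xa cleavage mentioned in the text. Function and variable names are hypothetical, and sequences are assumed to be upper-case one-letter strings.

```python
import re

# Each entry: (pattern, number of matched residues that precede the cut).
# Enterokinase cuts after the Lys of DDDDK; thrombin and Factor Xa cut after
# the Arg, subject to the restriction on the following residue X.
SITES = {
    "enterokinase": (re.compile(r"DDDDK"), 5),
    "thrombin": (re.compile(r"LVPRG[^DE]"), 4),               # X is non-acidic
    "factor_xa": (re.compile(r"(?:IEGR|IDGR|AEGR)[^PR]"), 4),  # X is not Pro/Arg
}


def find_cleavage_sites(sequence):
    """Return sorted (protease, cut_index) pairs.

    cut_index is the index of the first residue after the scissile bond, so
    sequence[cut_index:] is the carboxy-terminal cleavage product.
    """
    hits = []
    for protease, (pattern, offset) in SITES.items():
        for match in pattern.finditer(sequence):
            hits.append((protease, match.start() + offset))
    return sorted(hits, key=lambda hit: hit[1])


# Example: affinity peptide, enterokinase site, then a made-up target.
fusion = "HNHRHKH" + "DDDDK" + "MKTAY"
sites = find_cleavage_sites(fusion)
released = fusion[sites[0][1]:]
```

A single hit located at the junction, and none inside the target portion, is the situation the text describes as preferred.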

While less preferred, other unique amino acid sequences for other cleavable sites may also be employed in the spacer without departing from the spirit or scope of the present invention. For instance, the spacer may be composed, at least in part, of a pair of basic amino acids, i.e., Arg, His or Lys. This sequence is cleaved by kallikrein, a glandular enzyme. Also, the spacer may be composed, at least in part, of Arg-Gly, since it is known that the enzyme thrombin will cleave after the Arg if this residue is followed by Gly.

Regardless of whether a cleavage site is present, the recombinant polypeptide, protein or protein fragment may comprise an antigenic domain in a spacer region (Sp₁ or Sp₂). For example, in one embodiment of the present invention, the recombinant polypeptide, protein or protein fragment comprises one or multiple copies of an antigenic domain generally corresponding to the FLAG® (Sigma-Aldrich, St. Louis, Mo.) peptide sequence joined to a linking sequence containing a single enterokinase cleavage site. Such antigenic domains generally correspond to the sequence:

X²⁰—(X¹—Y—K—X²—X³-D-X⁴)_(n)—X⁵—(X¹—Y—K—X⁷—X⁸-D-X⁹—K)—X²¹ (SEQ ID NO: 39)

where:

D, Y and K are their representative amino acids;

X²⁰ and X²¹ are independently a hydrogen or a covalent bond;

each X¹ and X⁴ is independently a covalent bond or at least one amino acid residue, if other than a covalent bond, preferably at least one amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably at least one hydrophilic amino acid residue, and still more preferably at least one aspartate residue;

each X², X³, X⁷ and X⁸ is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

X⁵ is a covalent bond or a spacer domain comprising at least one amino acid, if other than a covalent bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)—, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val;

X⁹ is a covalent bond or D; and

n is 0, 1 or 2.

In this embodiment, the amino acid sequence X²⁰—(X¹—Y—K—X²—X³-D-X⁴)_(n)(SEQ ID NO: 35) comprises an antigenic domain —X¹—Y—K—X²—X³-D- (SEQ IDNO: 36) joined in tandem which are joined to a linking sequence(X¹—Y—K—X⁷—X⁸-D-X⁹—K) (SEQ ID NO: 37). The antigenic domains may beimmediately adjacent to each other when n is at least one and X⁴ is acovalent bond; optionally, X⁴ may be a spacer domain interposed betweenthe multiple copies of antigenic domains. The linking sequence containsa single enterokinase cleavable site which is represented by thesequence —X⁷—X⁸-D-X⁹—K (SEQ ID NO: 38), where X⁷ and X⁸ may be an aminoacid residue or a covalent bond and X⁹ is a covalent bond or anaspartate residue. In one embodiment, each X⁷, X⁸ and X⁹ isindependently an aspartate residue thus resulting in the enterokinasecleavable site DDDDK (SEQ ID NO: 26) which is preferably locatedimmediately adjacent to the amino terminus of the target peptide. When nis at least one and X⁵ is a covalent bond, the multiple copies ofantigenic domains may be immediately adjacent to the linking sequence;optionally, X⁵ may be a spacer domain interposed between the linkingsequence and the antigenic domains. When each X⁴ and X⁵ is independentlya spacer domain, it is preferred that the amino acid residue(s) of eachX⁴ and X⁵ impart one or more desired properties to the antigenic domain;for example, the amino acids of the spacer domain may be selected toimpart a desired folding to the identification polypeptide therebyincreasing accessibility to the antibody. In another embodiment, theamino acids of the spacer domain X⁴ and X⁵ may be selected to impart adesired affinity characteristic such as a combination of multiple oralternating histidine residues capable of chelating to an immobilizedmetal ion on a resin or other matrix. 
Furthermore, these desired properties may be designed into other areas of the identification polypeptide; for example, the amino acids represented by X² and X³ may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.

In another embodiment, the spacer comprises multiple copies of an antigenic domain. For example, in one embodiment the spacer may comprise a linking sequence containing a single enterokinase or other cleavage site, or generally correspond to the sequence:

X²⁰-(D-Y—K—X²—X³-D)_(n)-X⁵-(D-Y—K—X⁷—X⁸-D-X⁹—K)—X²¹ (SEQ ID NO: 40)

where:

D, Y, K are their representative amino acids;

X²⁰ and X²¹ are independently a hydrogen or a covalent bond;

each X², X³, X⁷ and X⁸ is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

X⁵ is a covalent bond or a spacer domain comprising at least one amino acid, if other than a covalent bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)—, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val;

X⁹ is a covalent bond or an aspartate residue; and

n is at least 2.

In this embodiment, the amino acid sequence X²⁰-(D-Y—K—X²—X³-D)_(n) (SEQID NO: 41) represents the multiple copies of the antigenic domainD-Y—K—X²—X³-D (SEQ ID NO: 31) in tandem which are joined to a linkingsequence (D-Y—K—X⁷—X⁸-D-X⁹—K) (SEQ ID NO: 32). In this embodiment, oneantigenic domain is immediately adjacent to another antigenic domain,i.e., no intervening spacer domains, and the multiple copies of theantigenic domain are immediately adjacent to the linking sequence whenX⁵ is a covalent bond. The linking sequence contains a singleenterokinase cleavable site which is represented by the sequence—X⁷—X⁸-D-X⁹-K (SEQ ID NO: 38), where X⁷ and X⁸ may be a covalent bond oran amino acid residue, preferably an aspartate residue, and X⁹ is acovalent bond or an aspartate residue. In one embodiment, each X⁷, X⁸and X⁹ is independently an aspartate residue thus resulting in theenterokinase cleavable site DDDDK (SEQ ID NO: 26) which is preferablyadjacent to the amino terminus of the target peptide. Optionally, themultiple copies of the antigenic domain are joined to the linkingsequence by a spacer X⁵ when X⁵ is at least one amino acid residue. WhenX⁵ is a spacer domain, it is preferred that the amino acid residue(s) ofX⁵ impart one or more desired properties to the recombinant polypeptide,protein or protein fragment; for example, the amino acids of the spacerdomain may be selected to impart a desired folding to the recombinantpolypeptide, protein or protein fragment thereby increasingaccessibility to the antibody. In another embodiment, the amino acids ofthe spacer domain may be selected to impart a desired affinitycharacteristic such as a combination of multiple or alternatinghistidine residues capable of chelating to an immobilized metal ion on aresin or other matrix. 
Furthermore, these desired properties may be designed into other areas of the spacer; for example, the amino acids represented by X² and X³ may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.
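The general sequence above can be illustrated with a short Python sketch that assembles one-letter sequences from the variable positions. This is only an illustration under stated assumptions: the names `his_x_spacer` and `build_tag`, the default values, and the use of an empty string to stand for a covalent bond are hypothetical conventions chosen here, not part of the disclosure. The sketch follows the definition list, in which X², X³, X⁷ and X⁸ are each a single amino acid residue.

```python
# Illustrative sketch only; function names and defaults are assumptions.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # one-letter codes of the 20 standard residues


def his_x_spacer(m, x="G"):
    """Build a -(His-X)m- spacer, with m from 1 to 6 and X any standard residue."""
    if not 1 <= m <= 6:
        raise ValueError("m must be 1 to 6")
    if x not in AMINO_ACIDS:
        raise ValueError("X must be a one-letter amino acid code")
    return ("H" + x) * m


def build_tag(n, x2="D", x3="D", x5="", x7="D", x8="D", x9="D"):
    """Assemble (D-Y-K-X2-X3-D)n - X5 - (D-Y-K-X7-X8-D-X9-K) as a one-letter string.

    An empty string stands for a covalent bond. The terminal groups X20 and X21
    (hydrogen or a covalent bond) add no residues and are omitted.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    for name, residue in (("x2", x2), ("x3", x3), ("x7", x7), ("x8", x8)):
        if residue not in AMINO_ACIDS:
            raise ValueError(f"{name} must be a single amino acid residue")
    if x9 not in ("", "D"):
        raise ValueError("x9 must be a covalent bond ('') or aspartate ('D')")
    if any(residue not in AMINO_ACIDS for residue in x5):
        raise ValueError("x5 must contain only amino acid residues")
    antigenic_domain = "DYK" + x2 + x3 + "D"
    linking_sequence = "DYK" + x7 + x8 + "D" + x9 + "K"
    return antigenic_domain * n + x5 + linking_sequence
```

With all variable positions set to aspartate, `build_tag(2)` ends in the DDDDK enterokinase site, and `build_tag(2, x5=his_x_spacer(3))` places an alternating histidine spacer between the antigenic domains and the linking sequence.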

When the affinity polypeptide is located at the amino terminus of the target polypeptide, protein or protein fragment, it is often desirable to design the amino acid sequence such that an initiator methionine is present. Accordingly, in one embodiment of the present invention, the recombinant polypeptide, protein or protein fragment comprises multiple copies of an antigenic domain and a linking sequence containing a single enterokinase cleavage site, and generally corresponds to the sequence:

X²⁰—X¹⁰-(D-Y—K—X²—X³-D)_(n)-X⁵-(D-Y—K—X⁷—X⁸-D-X⁹—K)—X²¹ (SEQ ID NO: 45)

where:

D, Y, and K are their representative amino acids;

X²⁰ and X²¹ are independently a hydrogen or a covalent bond;

X¹⁰ is a covalent bond or an amino acid, if other than a covalent bond, preferably a methionine residue;

each X², X³, X⁷ and X⁸ is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

X⁵ is a covalent bond or a spacer domain comprising at least one amino acid, if other than a bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)—, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, and Val;

X⁹ is a covalent bond or an aspartate residue; and

n is at least 2.

In this embodiment, the amino acid sequence X²⁰—X¹⁰-(D-Y—K—X²—X³-D)_(n) (SEQ ID NO: 44) represents the multiple copies of the antigenic domain D-Y—K—X²—X³-D (SEQ ID NO: 31) in tandem, which are flanked by a linking sequence (D-Y—K—X⁷—X⁸-D-X⁹—K) (SEQ ID NO: 32) and an initiator amino acid X¹⁰, preferably methionine. The antigenic domain D-Y—K—X²—X³-D with an initiator methionine is recognized by the M5® antibody (Sigma-Aldrich, St. Louis, Mo.). In this embodiment, one antigenic domain is immediately adjacent to another antigenic domain, i.e., no intervening spacer domains, and the multiple copies of the antigenic domain are immediately adjacent to the linking sequence when X⁵ is a covalent bond. The linking sequence contains an enterokinase cleavable site which is represented by the amino acid sequence —X⁷—X⁸-D-X⁹—K (SEQ ID NO: 38), where X⁷ and X⁸ may be a covalent bond or an amino acid residue, preferably an aspartate residue, and X⁹ is a covalent bond or an aspartate residue. In one embodiment, each X⁷, X⁸ and X⁹ is independently an aspartate residue thus resulting in the enterokinase cleavable site DDDDK (SEQ ID NO: 26) which is preferably adjacent to the amino terminus of the target peptide. Optionally, the multiple copies of the antigenic domain are joined to the linking sequence by a spacer domain X⁵ when X⁵ is at least one amino acid residue. When X⁵ is a spacer domain, it is preferred that the amino acid residue(s) of X⁵ impart one or more desired properties to the affinity polypeptide; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the recombinant polypeptide, protein or protein fragment thereby increasing accessibility to the antibody. In another embodiment, the amino acids of the spacer domain may be selected to impart a desired affinity characteristic such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix.
Furthermore, these desired properties may be designed into other areas of the affinity polypeptide; for example, the amino acids represented by X² and X³ may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.

In another embodiment of the present invention, the recombinant polypeptide, protein or protein fragment comprises one or more copies of an antigenic sequence and a linking sequence containing a single enterokinase cleavable site, and generally corresponds to the sequence:

X²⁰-(D-X¹¹—Y—X¹²—X¹³)_(n)—X¹⁴-(D-X¹¹—Y—X¹²—X¹³-D-X¹⁵—K)—X²¹ (SEQ ID NO:42)

where:

D, Y and K are their representative amino acids;

X²⁰ and X²¹ are independently a hydrogen or a covalent bond;

each X¹¹ is a covalent bond or an amino acid, preferably Leu;

each X¹² is an amino acid, preferably selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

each X¹³ is a covalent bond or at least one amino acid, if other than a covalent bond, preferably selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

X¹⁴ is a covalent bond or a spacer domain comprising at least one amino acid, if other than a covalent bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)—, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val;

X¹⁵ is a covalent bond or an aspartate residue; and

n is 0 or at least 1.

In this embodiment, when n is at least 2, the amino acid sequenceX²⁰-(D-X¹¹—Y—X¹²—X¹³)_(n) (SEQ ID NO: 43) constitutes multiple copies ofthe antigenic domain D-X¹¹—Y—X¹²—X¹³ (SEQ ID NO: 33) in tandem which arejoined to a linking sequence (D-X¹¹—Y—X¹²—X¹³-D-X¹⁵—K) (SEQ ID NO: 34).Additionally, one antigenic domain may be immediately adjacent toanother antigenic domain, i.e., no intervening spacer domains, and themultiple copies of the antigenic domain may be immediately adjacent tothe linking sequence when X¹⁴ is a covalent bond. The linking sequencecontains a single enterokinase cleavable site which is represented bythe sequence —X¹²—X¹³-D-X¹⁵—K, (SEQ ID NO: 38) where X¹² and X¹³ may bea covalent bond or an amino acid residue, preferably an aspartateresidue, and X¹⁵ is a covalent bond or an aspartate residue. In oneembodiment, each X¹², X¹³ and X¹⁵ is independently an aspartate residuethus resulting in the enterokinase cleavable site DDDDK (SEQ ID NO: 26)which is preferably adjacent to the amino terminus of the targetpeptide. Optionally, when n is at least two, the multiple copies of theantigenic domain are joined to the linking sequence by a spacer X¹⁴ whenX¹⁴ is at least one amino acid residue. When X¹⁴ is a spacer domain, itis preferred that the amino acid residue(s) of X¹⁴ impart one or moredesired properties to the recombinant polypeptide, protein or proteinfragment; for example, the amino acids of the spacer domain may beselected to impart a desired folding to the recombinant polypeptide,protein or protein fragment thereby increasing accessibility to theantibody. In another embodiment, the amino acids of the spacer domainX¹⁴ may be selected to impart a desired affinity characteristic such asa combination of multiple or alternating histidine residues capable ofchelating to an immobilized metal ion on a resin or other matrix.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the enzyme glutathione-S-transferase of the parasitic helminth Schistosoma japonicum (SEQ ID NO: 1). The glutathione-S-transferase may, however, be derived from other species, including human and other mammalian glutathione-S-transferases. Proteins expressed as fusions with the enzyme glutathione-S-transferase can be purified under non-denaturing conditions by affinity chromatography on immobilized glutathione. Glutathione-agarose beads have a capacity of at least 8 mg fusion protein/ml swollen beads and can be used several times for different preparations of the same fusion protein. Smith, D. B. and Johnson, K. S., Gene, 67:31-40, 1988.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises a cellulose binding domain (CBD) (SEQ ID NO: 2). CBDs are found in both bacterial and fungal sources and possess a high affinity for the crystalline form of cellulose. This property has been useful for purification of fusion proteins using a cellulose matrix. Fusion proteins have been attached at both the N- and C-terminus of CBD.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the Maltose Binding Protein (MBP) encoded by the malE gene in E. coli (SEQ ID NO: 3). MBP has found utility in the formation of chimeric proteins with eukaryotic proteins for expression in bacterial systems. This system permits expression of soluble fusion proteins that can readily be purified on immobilized amylose resin.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises Protein A (SEQ ID NO: 4). Protein A is isolated from Staphylococcus aureus and binds to the Fc region of IgG. Fusion proteins containing the IgG binding domains of Protein A can be affinity purified on IgG resins (e.g., IgG Sepharose 6FF (Pharmacia Biotech)). The signal sequence of Protein A is functional in E. coli. Fusion proteins using Protein A have shown increased stability when expressed both in the cytoplasm and in the periplasm of E. coli.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises Protein G (SEQ ID NO: 5). Protein G is similar to Protein A, with the difference being that Protein G binds to human serum albumin in addition to IgG. The major disadvantage is that a low pH (<3.4) is required to elute the fusion protein.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises IgG (SEQ ID NO: 6). Placing the protein of interest at the C-terminus of IgG generates chimeric proteins. This allows purification of the fusion protein using either a Protein A or a Protein G matrix.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the enzyme chloramphenicol acetyl transferase (CAT) from E. coli (SEQ ID NO: 7). CAT is used in the form of a C-terminal fusion. CAT is readily translated in E. coli and allows for over-expression of heterologous proteins. Capture of fusion proteins is accomplished using a chloramphenicol matrix.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises streptavidin (SEQ ID NO: 8). Streptavidin is used for fusion proteins because of its high affinity and high specificity for biotin. Streptavidin is a neutral protein, free from carbohydrates and sulphydryl groups.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises β-galactosidase (SEQ ID NO: 9). β-galactosidase is an enzyme that is utilized as both an N- and C-terminal fusion protein. Fusion proteins containing β-galactosidase sequences can be affinity purified on aminophenyl-β-D-thiogalactosidyl-succinyldiaminohexyl-Sepharose. However, given that C-terminal fusion proteins are usually insoluble, the system has limited use in bacterial systems. N-terminal fusions are soluble in E. coli, but due to the large size of β-galactosidase, this system is used more often in eukaryotic gene expression.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the Green Fluorescent Protein (GFP) (SEQ ID NO: 10). GFP is a protein from the jellyfish Aequorea victoria and many mutant variations of this protein have been used successfully in most organisms for protein expression. The major use of these types of fusion proteins is for targeting and determining physiological function of the host cell protein.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises thioredoxin (SEQ ID NO: 11). Thioredoxin is a relatively small thermostable protein that is easily over-expressed in bacterial systems. Thioredoxin fusion systems are useful in avoiding the formation of inclusion bodies during heterologous gene expression. This has been particularly useful in the expression of mammalian cytokines.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises Calmodulin Binding Protein (CBP) (SEQ ID NO: 12). This tag is derived from the C-terminus of skeletal muscle myosin light chain kinase. This small tag is recognized by calmodulin and forms the basis of the technology. The tag is translated efficiently and allows for the expression and recovery of N-terminal chimeric genes.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the c-myc epitope sequence Glu-Gln-Lys-Leu-Ile-Ser-Glu-Glu-Asp-Leu (SEQ ID NO: 13). This C-terminal portion of the myc oncogene, which is part of the p53 signaling pathway, has been used as a detection tag for expression of recombinant proteins in mammalian cells.

In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the HA epitope sequence Tyr-Pro-Tyr-Asp-Val-Tyr-Ala (SEQ ID NO: 14). This detection tag has been utilized for the expression of recombinant proteins in mammalian cells.

In another embodiment of this invention, the spacer (Sp₁ or Sp₂) comprises a polypeptide possessing an amino acid sequence having at least 70% homology to any one of the amino acid sequences disclosed in SEQ ID NOS: 1-14, and retains the same binding characteristics as said amino acid sequence.

DNA sequences encoding the aforementioned proteins which may be employed as spacers (Sp₁ or Sp₂) are commercially available (e.g., malE gene sequences encoding the MBP are available from New England Biolabs (pMAL-c2 and pMAL-p2); Schistosoma japonicum glutathione-S-transferase (GST) gene sequences are available from Pharmacia Biotech (the pGEX series which have GenBank Accession Nos.: U13849 to U13858); β-galactosidase (the lacZ gene product) gene sequences are available from Pharmacia Biotech (pCH110 and pMC1871; GenBank Accession Nos: U13845 and L08936, respectively); sequences encoding the IgG binding domains of Protein A are available from Pharmacia Biotech (pRIT2T; GenBank Accession No. U13864)).

When any of the above listed proteins (including the hinge/Fc domains of human IgG₁) are used as spacers, it is not required that the entire protein be used as a spacer. Portions of these proteins may be used as the spacer provided the portion selected is sufficient to permit interaction of a fusion protein containing the portion of the protein used as the spacer with the desired affinity resin.

Expression and Purification

The polypeptides, proteins and protein fragments of the present invention are generally prepared and expressed as a fusion protein using conventional recombinant DNA technology. The fusion protein is thus produced by host cells transformed with the genetic information encoding the fusion protein. The host cells may secrete the fusion protein into the culture media or store it in the cells, in which case the cells must be collected and disrupted in order to extract the product. As hosts, E. coli, yeast, insect cells, mammalian cells and plants are suitable. Of these, E. coli will typically be the more preferred host for most applications. In one embodiment, the recombinant polypeptides, proteins and protein fragments are produced in a soluble form or secreted from the host.

In general, a chimeric gene is inserted into an expression vector which allows for the expression of the desired fusion protein in a suitable transformed host. The expression vector provides the inserted chimeric gene with the necessary regulatory sequences to control expression in the suitable transformed host.

There are six elements of the expression control sequence for proteins which are to be secreted from a host into the medium, while five of these elements apply to fusion proteins expressed intracellularly. These elements, in the order they appear in the gene, are: a) the promoter region; b) the 5′ untranslated region; c) the signal sequence; d) the chimeric coding sequence; e) the 3′ untranslated region; f) the transcription termination site. Fusion proteins which are not secreted do not contain c), the signal sequence.
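The ordering of these elements, and the omission of the signal sequence for intracellular expression, can be written down as a small Python sketch. The names `CONTROL_ELEMENTS` and `control_elements` are hypothetical and serve only to restate the list above.

```python
# Illustrative sketch only; names are assumptions introduced for this example.
# Elements are listed in the order they appear in the gene.
CONTROL_ELEMENTS = [
    ("a", "promoter region"),
    ("b", "5' untranslated region"),
    ("c", "signal sequence"),
    ("d", "chimeric coding sequence"),
    ("e", "3' untranslated region"),
    ("f", "transcription termination site"),
]


def control_elements(secreted):
    """Return the ordered element names; element c) is dropped when not secreted."""
    return [name for key, name in CONTROL_ELEMENTS if secreted or key != "c"]
```

Calling `control_elements(True)` yields all six elements for a secreted fusion protein, while `control_elements(False)` yields the five that apply to intracellular expression.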

The recombinant expression vectors of the invention comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell. This means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, operably linked to the nucleic acid sequence to be expressed. It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein.

Vector DNA can be introduced into prokaryotic or eukaryotic cells viaconventional transformation or transfection techniques. For stabletransfection of mammalian cells, it is known that, depending upon theexpression vector and transfection technique used, only a small fractionof cells may integrate the foreign DNA into their genome. In order toidentify and select these integrants, a gene that encodes a selectablemarker (e.g., for resistance to antibiotics) is generally introducedinto the host cells along with the gene of interest. Preferredselectable markers include those which confer resistance to drugs, suchas G418, hygromycin, and methotrexate. Nucleic acid encoding aselectable marker can be introduced into a host cell on the same vectoras that encoding the metal ion-affinity peptide containing fusionprotein or can be introduced on a separate vector. Cells stablytransfected with the introduced nucleic acid can be identified by drugselection (e.g., cells that have incorporated the selectable marker genewill survive, while the other cells die). Methods and materials forpreparing recombinant vectors, transforming host cells using replicatingvectors, and expressing biologically active foreign polypeptides andproteins are generally well known.

The expressed recombinant polypeptides, proteins and protein fragments may be separated from other material present in the secretion media or extraction solution, or from other liquid mixtures, through immobilized metal affinity chromatography (“IMAC”). For example, the culture media containing the secreted recombinant polypeptides, proteins and protein fragments or the cell extracts containing the recombinant polypeptides, proteins and protein fragments may be passed through a column that contains a resin comprising an immobilized metal ion. In IMAC, metal ions are immobilized onto a solid support and used to capture proteins comprising a metal chelating peptide. The metal chelating peptide may occur naturally in the protein, or the protein may be a recombinant protein with an affinity tag comprising a metal chelating peptide. Exemplary metal ions include aluminum, cadmium, calcium, cobalt, copper, gallium, iron, nickel, ytterbium and zinc. In one embodiment, the metal ion is preferably nickel, copper, cobalt, or zinc. In another embodiment, the metal ion is nickel. Advantageously, the components of the solution other than the recombinant polypeptide, protein or protein fragment freely pass through the column. The immobilized metal, however, chelates or binds the recombinant polypeptides, proteins and protein fragments, thereby separating them from the remaining contents of the liquid mixture in which they were originally contained.

Resins useful for producing immobilized metal ion affinity chromatography (IMAC) columns are available commercially. Examples of resins derivatized with iminodiacetic acid (IDA) are Chelating Sepharose 6B (Pharmacia), Immobilized Iminodiacetic Acid (Pierce), and Iminodiacetic Acid Agarose (Sigma-Aldrich). In addition, Porath has immobilized tris(carboxymethyl)ethylenediamine (TED) on Sepharose 6B and used it to fractionate serum proteins. Porath, J. and Olin, B., Biochemistry, 22:1621-1630, 1983. Other reports suggest that trisacryl GF2000 and silica can be derivatized with IDA, TED, or aspartic acid, and the resulting materials used in producing IMAC substances.

In one embodiment, the capture ligand is a metal chelate as described in WO 01/81365. More specifically, in this embodiment the capture ligand is a metal chelate derived from metal chelating composition (1):

wherein

Q is a carrier;

S¹ is a spacer;

L is -A-T-CH(X)— or —C(═O)—;

A is an ether, thioether, selenoether, or amide linkage;

T is a bond or substituted or unsubstituted alkyl or alkenyl;

X is —(CH₂)_(k)CH₃, —(CH₂)_(k)COOH, —(CH₂)_(k)SO₃H, —(CH₂)_(k)PO₃H₂, —(CH₂)_(k)N(J)₂, or —(CH₂)_(k)P(J)₂, preferably —(CH₂)_(k)COOH or —(CH₂)_(k)SO₃H;

k is an integer from 0 to 2;

J is hydrocarbyl or substituted hydrocarbyl;

Y is —COOH, —H, —SO₃H, —PO₃H₂, —N(J)₂, or —P(J)₂, preferably —COOH;

Z is —COOH, —H, —SO₃H, —PO₃H₂, —N(J)₂, or —P(J)₂, preferably —COOH; and

i is an integer from 0 to 4, preferably 1 or 2.

In general, the carrier, Q, may comprise any solid or soluble material or compound capable of being derivatized for coupling. Solid (or insoluble) carriers may be selected from a group including agarose, cellulose, methacrylate co-polymers, polystyrene, polypropylene, paper, polyamide, polyacrylonitrile, polyvinylidene, polysulfone, nitrocellulose, polyester, polyethylene, silica, glass, latex, plastic, gold, iron oxide and polyacrylamide, but may be any insoluble or solid compound able to be derivatized to allow coupling of the remainder of the composition to the carrier, Q. Soluble carriers include proteins, nucleic acids including DNA, RNA, and oligonucleotides, lipids, liposomes, synthetic soluble polymers, polyamino acids, albumin, antibodies, enzymes, streptavidin, peptides, hormones, chromogenic dyes, fluorescent dyes, fluorochromes or any other detection molecule, drugs, small organic compounds, polysaccharides and any other soluble compound able to be derivatized for coupling the remainder of the composition to the carrier, Q. In one embodiment, the carrier, Q, is the container of the present invention. In another embodiment, the carrier, Q, is a body provided within the container of the present invention.

The spacer, S¹, which flanks the carrier comprises a chain of atoms which may be saturated or unsaturated, substituted or unsubstituted, linear or cyclic, or straight or branched. Typically, the chain of atoms defining the spacer, S¹, will consist of no more than about 25 atoms; stated another way, the backbone of the spacer will consist of no more than about 25 atoms. More preferably, the chain of atoms defining the spacer, S¹, will consist of no more than about 15 atoms, and still more preferably no more than about 12 atoms. The chain of atoms defining the spacer, S¹, will typically be selected from the group consisting of carbon, oxygen, nitrogen, sulfur, selenium, silicon and phosphorus and preferably from the group consisting of carbon, oxygen, nitrogen, sulfur and selenium. In addition, the chain atoms may be unsubstituted or substituted with groups other than hydrogen such as hydroxy, keto (═O), or acyl such as acetyl. Thus, the chain may optionally include one or more ether, thioether, selenoether, amide, or amine linkages between hydrocarbyl or substituted hydrocarbyl regions. Exemplary spacers, S¹, include methylene, alkyleneoxy (—(CH₂)_(a)O—), alkylenethioether (—(CH₂)_(a)S—), alkyleneselenoether (—(CH₂)_(a)Se—), alkyleneamide (—(CH₂)_(a)NR¹(C═O)—), alkylenecarbonyl (—(CH₂)_(a)CO—), and combinations thereof wherein a is generally from 1 to about 20 and R¹ is hydrogen or hydrocarbyl, preferably alkyl. In one embodiment, the spacer, S¹, is a hydrophilic, neutral structure and does not contain any amine linkages or substituents or other linkages or substituents which could become electrically charged during the purification of a polypeptide.

As noted above, the linker, L, may be -A-T-CH(X)— or —C(═O)—. When L is -A-T-CH(X)—, the chelating composition corresponds to the formula:

wherein Q, S¹, A, T, X, Y, and Z are as previously defined. In this embodiment, the ether (—O—), thioether (—S—), selenoether (—Se—) or amide ((—NR¹(C═O)—) or (—(C═O)NR¹—) wherein R¹ is hydrogen or hydrocarbyl) linkage is separated from the chelating portion of the molecule by a substituted or unsubstituted alkyl or alkenyl region. If other than a bond, T is preferably substituted or unsubstituted C₁ to C₆ alkyl or substituted or unsubstituted C₂ to C₆ alkenyl. More preferably, A is —S—, T is —(CH₂)_(n)—, and n is an integer from 0 to 6, typically 0 to 4, and more typically 0, 1 or 2.

When L is —C(═O)—, the chelating composition corresponds to the formula:

wherein Q, S¹, i, Y, and Z are as previously defined.

In one embodiment, the sequence —S¹-L-, in combination, is a chain of no more than about 35 atoms selected from the group consisting of carbon, oxygen, sulfur, selenium, nitrogen, silicon and phosphorus, more preferably only carbon, oxygen, sulfur and nitrogen, and still more preferably only carbon, oxygen and sulfur. To reduce the prospects for non-specific binding, nitrogen, when present, is preferably in the form of an amide moiety. In addition, if the carbon chain atoms are substituted with anything other than hydrogen, they are preferably substituted with hydroxy or keto. In one embodiment, L comprises a portion (sometimes referred to as a fragment or residue) derived from an amino acid such as cystine, homocystine, cysteine, homocysteine, aspartic acid, cysteic acid or an ester thereof such as the methyl or ethyl ester thereof.

Exemplary chelating compositions corresponding to formula 1 include the following:

wherein Q is a carrier and Ac is acetyl.

In another embodiment, the capture ligand is a metal chelate of the type described in U.S. Pat. No. 5,047,513. More specifically, in this embodiment the capture ligand is a metal chelate derived from nitrilotriacetic acid derivatives of the formula

wherein S² is —O—CH₂—CH(OH)—CH₂— or —O—CO— and x is 2, 3 or 4. In this embodiment, the nitrilotriacetic acid derivative is immobilized on any of the previously described carriers, Q.

In these embodiments in which the capture ligand is a metal chelate as described in WO 01/81365 or U.S. Pat. No. 5,047,513, the metal chelate may contain any of the metal ions previously described in connection with IMAC. In one embodiment, the metal chelate comprises a metal ion selected from among nickel (Ni²⁺), zinc (Zn²⁺), copper (Cu²⁺), iron (Fe³⁺), cobalt (Co²⁺), calcium (Ca²⁺), aluminum (Al³⁺), magnesium (Mg²⁺), and manganese (Mn²⁺). In another embodiment, the metal chelate comprises nickel (Ni²⁺).

Another common purification technique that can be used in the context ofthe present invention is the use of an immunogenic capture system wherethe recombinant polypeptide, protein or protein fragment comprises anantigenic domain in a spacer region (Sp₁ or Sp₂). Any of the previouslydescribed antigenic systems comprising the spacer may be used for thispurpose. In such systems, an epitope tag on a protein or peptide allowsthe protein to which it is attached to be purified based upon theaffinity of the epitope tag for a corresponding ligand (e.g., antibody)immobilized on a support. One example of such a tag is the sequenceAsp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 15), or DYKDDDDK (SEQ ID NO:15); antibodies having specificity for this sequence are sold bySigma-Aldrich (St. Louis, Mo.) under the FLAG® trademark. Anotherexample of such a tag is the sequence Asp-Leu-Tyr-Asp-Asp-Asp-Asp-Lys(SEQ ID NO: 16), or DLYDDDDK (SEQ ID NO: 16); antibodies havingspecificity for this sequence are sold by Invitrogen (Carlsbad, Calif.).Another example of such a tag is the 3× FLAG® sequenceMet-Asp-Tyr-Lys-Asp-His-Asp-Gly-Asp-Tyr-Lys-Asp-His-Asp-Ile-Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys(SEQ ID NO: 17); antibodies having specificity for this sequence aresold by Sigma-Aldrich (St. Louis, Mo.). Thus, in one embodiment, thecarrier comprises immobilized antibodies which have specificity for theDYKDDDDK (SEQ ID NO: 15) epitope; in another embodiment, the carriercomprises immobilized antibodies which have specificity for the DLYDDDDK(SEQ ID NO: 16) epitope. In another embodiment, the carrier comprisesimmobilized antibodies which have specificity for SEQ ID NO: 17. Forexample, in one embodiment, the ANTI-FLAG® M1, M2, or M5 antibody isimmobilized on the interior surface of a column, or a portion thereof,and/or a bead or other support within a column.
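The epitope tags above are given in both three-letter and one-letter notation. As a minimal sketch, the following Python helper converts the hyphen-separated three-letter form into the one-letter form using the standard amino acid codes. The names `THREE_TO_ONE` and `to_one_letter` are hypothetical and introduced only for illustration.

```python
# Illustrative sketch only; names are assumptions introduced for this example.
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(sequence):
    """Convert a hyphen-separated three-letter sequence to one-letter codes.

    Raises KeyError if a residue name is not one of the 20 standard codes.
    """
    return "".join(THREE_TO_ONE[residue] for residue in sequence.split("-"))
```

For example, `to_one_letter("Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys")` gives the one-letter form of SEQ ID NO: 15 quoted above.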

After the recombinant polypeptides, proteins and protein fragments are separated from other components of the liquid mixture, the conditions in the column may be changed to release the bound material. For example, the bound molecules may be eluted from the column by a pH change, by imidazole, or by competition with another linker peptide.

Alternatively, the target polypeptide, protein or protein fragment portion of the bound recombinant polypeptide, protein or protein fragment may be selectively released from the immobilized metal. For example, if there is a cleavage site between the target polypeptide, protein or protein fragment and the metal ion-affinity peptide, and if the bound recombinant polypeptide, protein or protein fragment is treated with the appropriate enzyme, the target polypeptide, protein or protein fragment may be selectively released while the metal ion-affinity polypeptide fragment remains bound to the immobilized metal. For this purpose, the cleavage site is preferably an enzymatically cleavable linker peptide having the ability to undergo site-specific proteolysis. Suitable cleaving enzymes in accordance with this invention are activated factor X (factor Xa), DPP I, DPP II, DPP IV, carboxypeptidase A, collagenase, enterokinase, human renin, thrombin, trypsin, subtilisin and V5.

It is to be appreciated that some polypeptide or protein molecules will possess the desired enzymatic or biological activity with the metal chelate peptide still attached either at the C-terminal end or at the N-terminal end or both. In those cases the purification of the chimeric protein will be accomplished without subjecting the protein to site-specific proteolysis.

The present invention may be used to purify any prokaryotic or eukaryotic protein that can be expressed as the product of recombinant DNA technology in a transformed host cell. These recombinant protein products include hormones, receptors, enzymes, storage proteins, blood proteins, mutant proteins produced by protein engineering techniques, or synthetic proteins. The purification process of the present invention can be used batchwise or in continuously run columns.

It is to be understood that the present invention has been described in detail by way of illustration and example in order to acquaint others skilled in the art with the invention, its principles, and its practical application. Further, the specific embodiments of the present invention as set forth are not intended to be exhaustive or to limit the invention, and many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing examples and detailed description. Accordingly, this invention is intended to embrace all such alternatives, modifications, and variations that fall within the spirit and scope of the following claims. While some of the examples and descriptions above include some conclusions about the way the invention may function, the inventors do not intend to be bound by those conclusions and functions, but put them forth only as possible explanations in light of current understanding.

Abbreviations and Definitions

To facilitate understanding of the invention, a number of terms are defined below. Any term not defined is understood to have the normal meaning used by scientists contemporaneous with the submission of this application.

The term “expression vector” as used herein refers to nucleic acid sequences containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes include a promoter, a ribosome binding site, an initiation codon, a stop codon, optionally an operator sequence and possibly other regulatory sequences. Eukaryotic cells utilize promoters, a Kozak sequence and often enhancers and polyadenylation signals. Prokaryotic cells also utilize a Shine-Dalgarno ribosome binding site. The present invention includes vectors or plasmids which can be used as vehicles to transform any viable host cell with the recombinant DNA expression vector.

“Operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).

The term “regulatory sequence” is intended to include promoters, enhancers, and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences).

The terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. Suitable methods for transforming or transfecting host cells can be found in laboratory manuals.

The term “hydrophilic” when used in reference to amino acids refers to those amino acids which have polar and/or charged side chains. Hydrophilic amino acids include lysine, arginine, histidine, aspartate (i.e., aspartic acid), glutamate (i.e., glutamic acid), serine, threonine, cysteine, tyrosine, asparagine and glutamine.

The term “hydrophobic” when used in reference to amino acids refers to those amino acids which have nonpolar side chains. Hydrophobic amino acids include valine, leucine, isoleucine, cysteine and methionine. Three hydrophobic amino acids have aromatic side chains. Accordingly, the term “aromatic” when used in reference to amino acids refers to the three aromatic hydrophobic amino acids phenylalanine, tyrosine and tryptophan.

The term “fusion protein” refers to polypeptides and proteins which consist of a metal ion-affinity linker peptide and a protein or polypeptide operably linked directly or indirectly to the metal ion-affinity peptide. The metal ion-affinity linker peptide may be located at the amino-terminal portion of the fusion protein or at the carboxy-terminal portion, thus forming an “amino-terminal fusion protein” or a “carboxy-terminal fusion protein,” respectively.

The terms “metal ion-affinity peptide”, “metal binding peptide” and “linker peptide” are used interchangeably to refer to an amino acid sequence which displays an affinity to metal ions. The minimum length of the immobilized metal ion-affinity peptide according to the present invention is seven amino acids including four alternating histidines. The most preferred length is seven amino acids including four alternating histidines.
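A seven-residue sequence with four alternating histidines has the form His-X-His-X-His-X-His. As a minimal sketch, and assuming one-letter amino acid notation, such a pattern could be located in a protein sequence with a regular expression. The helper name `find_alternating_his` and the test sequence are hypothetical; the sketch makes no attempt to restrict the intervening residues to any particular set.

```python
import re

# H-X-H-X-H-X-H in one-letter code. The lookahead lets overlapping
# occurrences be reported, which a plain pattern would skip.
ALTERNATING_HIS = re.compile(r"(?=(H[A-Z]H[A-Z]H[A-Z]H))")


def find_alternating_his(seq):
    """Return (start_index, motif) pairs for every H-X-H-X-H-X-H match."""
    return [(m.start(), m.group(1)) for m in ALTERNATING_HIS.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequence carrying one tag near its N-terminus.
    print(find_alternating_his("MHNHRHKHAA"))
```

Matching by pattern identifies candidate sequences only; whether a given peptide actually binds an immobilized metal ion has to be determined experimentally.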

The term “enzyme” referred to herein in the context of a cleavage enzyme means a polypeptide or protein which recognizes a specific amino acid sequence in a polypeptide and cleaves the polypeptide at the scissile bond. In one embodiment of the present invention, enterokinase is the enzyme which is used to free the fusion protein from the immobilized metal ion column. In further embodiments, carboxypeptidase A, DPP I, DPP II, DPP IV, factor Xa, human renin, TEV, thrombin or VIII protease is the enzyme.

The term “cleavage site” used herein refers to an amino acid sequence which is recognized and cleaved by an enzyme or chemical means at the scissile bond.

The term “scissile bond” referred to herein is the juncture where cleavage occurs; for example the scissile bond recognized by enterokinase may be the bond following the sequence (Asp₄)-Lys in the spacer peptide or affinity peptide.
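The position of the scissile bond in the enterokinase example can be illustrated in one-letter code, where (Asp₄)-Lys is written DDDDK. The following Python sketch is a simplified model under that assumption: it splits a sequence at the bond immediately following each DDDDK and ignores any other factor that affects real proteolysis. The function name `cleave_after_site` and the example fusion sequence (with the placeholder target "AAGS") are hypothetical.

```python
def cleave_after_site(seq, site="DDDDK"):
    """Split seq at the bond following each occurrence of site.

    Simplified model: the scissile bond is taken to be the peptide bond
    immediately C-terminal to the recognition sequence.
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        end = pos + len(site)
        fragments.append(seq[start:end])  # fragment ends with the site
        start = end
        pos = seq.find(site, start)
    if start < len(seq):
        fragments.append(seq[start:])  # remainder after the last site
    return fragments


if __name__ == "__main__":
    # Hypothetical fusion: affinity peptide + DYKDDDDK spacer + target.
    fusion = "MHNHRHKHDYKDDDDKAAGS"
    print(cleave_after_site(fusion))
```

In this model the N-terminal fragment retains both the affinity peptide and the recognition sequence, which is consistent with the description of the affinity fragment remaining bound while the target is released.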

By the term “immobilized metal ion-affinity peptide” as used herein is meant an amino acid sequence that chelates immobilized divalent metal ions of metals selected from the group consisting of aluminum, cadmium, calcium, cobalt, copper, gallium, iron, nickel, ytterbium and zinc.

The term “capture ligand” means any ligand or receptor that can be immobilized or supported on a container or support and used to isolate a cellular component from cellular debris. Some non-limiting examples of capture ligands that may be used in connection with the present invention include: biotin, streptavidin, various metal chelate ions, antibodies, various charged particles such as those for use in ion exchange chromatography, various affinity chromatography supports, and various hydrophobic groups for use in hydrophobic chromatography.

For all the nucleotide and amino acid sequences disclosed herein, it is understood that equivalent nucleotides and amino acids can be substituted into the sequences without affecting the function of the sequences. Such substitutions are within the ability of a person of ordinary skill in the art.

The procedures disclosed herein which involve the molecular manipulation of nucleic acids are known to those skilled in the art.

EXAMPLES

Example 1 Construction and Screening of a Metal Ion-Affinity Peptide Library

A pseudo-random glutathione-S-transferase C-terminal peptide library was constructed with the amino acid sequence of His-X-His-X-His-X-His where X is any amino acid except Gln, His and Pro. The library vector was constructed from the bacterial expression vector pGEX-2T. The library was constructed by annealing a pair of complementary oligonucleotides together. Oligonucleotides were constructed as follows: 5′GATCCCATDNDCATDNDCATDNDCATTAAC3′ (SEQ ID NO: 18) and 5′AATTGTTAATGHNHATGHNHATGHNHATGG3′ (SEQ ID NO: 19) where D is nucleotides A, G, or T, H is nucleotides A, C, or T and N is nucleotides A, C, T, or G. The 5′ ends were phosphorylated with T4 polynucleotide kinase and the oligonucleotides were annealed together to generate a cassette. The cassette was ligated into pGEX-2T, which had been digested with EcoRI and BamHI restriction endonucleases. Ligated vector was transformed into E. coli DH5-α using standard protocols. Transformants were plated on LB/ampicillin plates (100 mg/L) and incubated overnight at 37° C.
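The relationship between the degenerate codon DND and the stated exclusion of Gln, His and Pro can be checked by enumeration. The following Python sketch assumes the standard genetic code and the nucleotide definitions given above (D = A, G or T; N = any base); the helper name `expand` is hypothetical. It expands DND into its concrete codons and lists which amino acids are and are not encoded.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Degenerate nucleotide symbols as defined in the text.
DEGENERATE = {"D": "AGT", "H": "ACT", "N": "ACGT"}


def expand(pattern):
    """Expand a degenerate nucleotide pattern into all concrete sequences."""
    choices = (DEGENERATE.get(symbol, symbol) for symbol in pattern)
    return ["".join(p) for p in product(*choices)]


codons = expand("DND")
encoded = {CODON_TABLE[c] for c in codons}
missing = set("ACDEFGHIKLMNPQRSTVWY") - encoded

if __name__ == "__main__":
    print(len(codons), "codons")
    print("amino acids not encoded:", sorted(missing))
    print("stop codons included:", sorted(c for c in codons if CODON_TABLE[c] == "*"))
```

Under these assumptions DND expands to 36 codons, and the only amino acids without a codon in the set are Gln, His and Pro, in agreement with the text. The enumeration also shows that the three stop codons fall within DND, so some library members would be expected to encode truncated peptides.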

Nine hundred colonies were picked and placed on 9 master plates. Each master plate contained 100 colonies and was grown overnight at 37° C. A piece of nitrocellulose was placed onto each of the master plates. This piece of nitrocellulose was then removed and the transferred colonies were placed onto an LB/ampicillin plate containing 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) to induce the expression of the GST fusion peptides. The cells were allowed to grow for an additional 4 hours at 37° C. The nitrocellulose filter was removed from the plate and placed sequentially on blotting paper containing the following solutions to lyse the cells in situ:

(a) 10% SDS for 10 minutes,
(b) 1.5 M sodium chloride, 0.5 M sodium hydroxide for 5 minutes,
(c) 1.5 M sodium chloride, 0.5 M Tris-HCl pH 7.4 for 5 minutes,
(d) 1.5 M sodium chloride, 0.5 M Tris-HCl pH 7.4 for 5 minutes,
(e) 2×SSC for 15 minutes.

The filters were dried at ambient temperature followed by an incubation in Tris-buffered saline (TBS) containing 3% non-fat dry milk for 1 hour at room temperature. Filters were then washed 3× for 5 minutes with TBS containing 0.05% Tween-20 (TBS-T). To detect clones that were capable of binding to a metal ion, the filters were incubated with nickel NTA horseradish peroxidase (HRP) at a concentration of 1 mg/ml in TBS-T for 1 hour. The filters were then washed with TBS-T 3× for 5 minutes and incubated with 3,3′,5,5′-tetramethylbenzidine (TMB) to detect the horseradish peroxidase. The reaction was stopped by placing the filters in water.

The 250 colonies detected above were picked from the master plates and placed into 1 ml of LB/ampicillin and grown overnight in a 96 deep well plate at 37° C. at 250 rpm on an orbital shaker. 10 μl of the overnight cultures were transferred to a fresh aliquot of LB/ampicillin (1 ml) in a 96 deep well plate and grown for an additional 3 hours. The culture was then induced by adding IPTG (final concentration of 1 mM) and the culture was allowed to grow for an additional 3 hours prior to harvesting by centrifugation. The media was decanted and the cells were frozen overnight at −20° C. in the collection plate. Cells were lysed with 0.6 ml of CelLytic-B (Sigma-Aldrich product no. B3553) and incubated for 15 minutes at room temperature. The cell debris was removed by centrifugation at 3,000×g for 15 minutes.

Two experiments were done in parallel, one on a HIS-Select High Sensitivity (HS) nickel coated plate, and the second on a HIS-Select High Capacity (HC) nickel coated plate. 0.1 ml of cell extract from each clone was placed in an HS microwell plate in the presence of imidazole at a final concentration of 5 mM. This is the selective condition used for screening the different metal ion-affinity clones. HS plates were incubated for 4 hours at room temperature. Plates were then washed 3× with phosphate-buffered saline (PBS) containing 0.05% Tween-20 (PBS-T). The HS plates were then incubated with anti-GST at 1:1,000 dilution in PBS-BSA buffer (0.2 ml/well) for 1 hour at room temperature. HS plates were washed 3× with PBS-T. The HS plates were then incubated with anti-mouse HRP conjugate at 1:10,000 dilution in PBS-BSA buffer for 1 hour at room temperature. Plates were washed 3× with PBS-T. The plate was then developed with 2,2′-azino-bis(3-ethylbenzthiazoline-6-sulfonic acid) (ABTS) substrate. Color development was stopped by the addition of sodium azide to a final concentration of 2 mM. Absorbance of the plates was read at 405 nm using a Wallac 1420 plate reader.

The HC plates were used to further analyze potential clones. To further characterize the clones, 0.2 ml of cell extracts were applied to the HC plates and the plates were incubated at ambient temperature for 1 hour. The plates were washed with PBS as described above. Twenty-one clones that produced the highest response on the HS plates were eluted from the corresponding HC plate. The selected cloned proteins were eluted from the HC plates by incubating at 37° C. for 15 minutes in 50 mM sodium phosphate, 0.3 M sodium chloride and 0.2 M imidazole buffer. Eluted proteins were then moved to clean tubes and analyzed by SDS-PAGE. All 21 clones had the expected molecular weight and were sequence verified.

These 21 colonies were grown overnight in 1 ml LB/ampicillin media at 37° C. at 250 rpm. 100 μl of the overnight cultures were transferred to 50 ml of fresh LB/ampicillin media and the cultures grown for an additional 3 hours at 37° C. The cultures were induced with IPTG (final concentration of 1 mM) and the cultures grown for an additional 3 hours prior to harvesting by centrifugation.

Example 2 Construction of an N-Terminal Metal Ion-Affinity Fusion Protein

Two metal ion-affinity tags were introduced at the N-terminus of bacterial alkaline phosphatase (BAP). The constructs were derived from the BAP expression vector pFLAG-CTS-BAP. Construction was done by annealing two pairs of complementary oligonucleotides together. The following oligonucleotides were constructed: 5′TATGCATAATCATCGACATGAACATA3′ (SEQ ID NO: 20), 5′AGCTTATGTTTATGTCGATGATTATGCA3′ (SEQ ID NO: 21), 5′TATGCATAAACATAGACATGGGCATA3′ (SEQ ID NO: 22) and 5′AGCTTGATGCCCATGTCTATGTTTATGCA3′ (SEQ ID NO: 23). The oligonucleotides were annealed together to generate a cassette. The cassette was ligated into pFLAG-CTS-BAP, which had been digested with NdeI and HindIII restriction endonucleases. Ligated vector was transformed into E. coli DH5-α using standard protocols and plated on LB/ampicillin.
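The peptides encoded by the two sense-strand oligonucleotides (SEQ ID NOS: 20 and 22) can be read off by translation. The following Python sketch assumes the standard genetic code and assumes that the reading frame begins at the ATG found at the second nucleotide of each oligonucleotide, as would be the case if that ATG is the initiation codon of the NdeI site; the text does not state the frame explicitly. The function name `translate` is hypothetical.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, start=0):
    """Translate complete codons of dna beginning at index start."""
    codons = (dna[i:i + 3] for i in range(start, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Sense-strand oligonucleotides copied from the text.
SEQ20 = "TATGCATAATCATCGACATGAACATA"
SEQ22 = "TATGCATAAACATAGACATGGGCATA"

if __name__ == "__main__":
    # Assumed frame: the ATG at index 1 of each oligonucleotide.
    print(translate(SEQ20, start=1))
    print(translate(SEQ22, start=1))
```

Under the assumed frame, both oligonucleotides encode an initiator Met followed by a seven-residue sequence with histidine at alternating positions and arginine at the central position.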

Example 3 Expression of an N-Terminal Metal Ion-Affinity Fusion Protein

MAT-BAP fusion peptide cultures were grown overnight in 1 ml LB/ampicillin at 37° C. 500 μl of overnight cultures were transferred to 500 ml of fresh TB media containing ampicillin (100 mg/L). The cultures were grown for three hours at 37° C. at 250 rpm. Protein expression was induced by the addition of IPTG (final concentration of 1 mM). Cultures were then grown for an additional three hours, harvested by centrifugation and stored at −70° C. until further use.
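The volumes involved in the inoculation and induction steps follow from simple dilution arithmetic (C₁V₁ = C₂V₂). The following Python sketch is illustrative only. The IPTG stock concentration of 1 M is an assumption, since the text specifies only the final concentration, and the small volume added by the stock is neglected. The function names are hypothetical.

```python
def stock_volume_ml(final_conc_mM, culture_volume_ml, stock_conc_mM):
    """Volume of stock needed to reach final_conc_mM (C1*V1 = C2*V2).

    The volume contributed by the stock itself is neglected.
    """
    return final_conc_mM * culture_volume_ml / stock_conc_mM


def dilution_factor(inoculum_ml, medium_ml):
    """Fold dilution of an inoculum added to fresh medium."""
    return (inoculum_ml + medium_ml) / inoculum_ml


if __name__ == "__main__":
    # 500 ml culture induced to 1 mM IPTG from an ASSUMED 1 M (1000 mM) stock.
    print(stock_volume_ml(1.0, 500.0, 1000.0), "ml of IPTG stock")
    # 500 microliters (0.5 ml) of overnight culture into 500 ml of TB media.
    print(dilution_factor(0.5, 500.0), "fold dilution")
```

With a different stock concentration, only the third argument to `stock_volume_ml` changes.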

Example 4 Metal Ion-Affinity Fusion Protein Purification Protocol #1

Cells were resuspended in 2 ml of TE (50 mM Tris-HCl pH 8.0, 2 mM EDTA). Lysozyme (4 mg/ml in 2 ml of TE) was added to the resuspended cells and the cells were lysed at ambient temperature for 4 hours. The cell debris was removed by centrifugation at 27,000×g for 15 minutes. The supernatant was dialyzed overnight against 50 mM Tris-HCl pH 8.0 to remove the EDTA. The dialyzed supernatant was applied to a 1 ml column containing a nickel bis-carboxy-methyl-cysteine resin (nickel resin). The column was washed with 4 ml of 50 mM Tris-HCl pH 8.0 and then washed with 2 ml of 50 mM Tris-HCl pH 8.0, 10 mM imidazole. The column was then eluted with 50 mM Tris-HCl pH 8.0, 250 mM imidazole. Samples were analyzed for purity by SDS-PAGE.

Example 5 Metal Ion-Affinity Fusion Protein Purification Protocol #2

Cells were resuspended with CelLytic B (Sigma-Aldrich product no. B3553), and 10 mM imidazole. The cells were solubilized by incubation for 15 minutes. The cell debris was removed by centrifugation at 15,000×g for 5 minutes at room temperature. A 0.5 ml column, containing nickel resin, was equilibrated with 10 column volumes (5 ml) of 50 mM sodium phosphate, pH 8, and 300 mM sodium chloride (column buffer). The supernatant was loaded on the column. The column was washed with 10 column volumes (5 ml) of 10 mM imidazole in column buffer. The column was eluted with 100 mM imidazole in column buffer. The samples were analyzed for specificity by SDS-PAGE.

Example 6 Metal Ion-Affinity Fusion Protein Purification Protocol #3: Use of Chaotropic Agents

The cells were resuspended in 100 mM sodium phosphate, pH 8, and 8 M urea (denaturant column buffer). The cells were solubilized by sonication three times, 15 seconds each, with a probe sonicator. Cell debris was removed by centrifugation at 15,000×g for 5 minutes at room temperature. A 0.5 ml column, containing nickel resin, was equilibrated with 10 column volumes (5 ml) of the denaturant column buffer. The supernatant was loaded on the column and the column was washed with 10 column volumes (5 ml) of denaturant column buffer. The column was sequentially eluted with 100 mM sodium phosphate, 8 M urea at pH 7.5, 7.0, 6.5, 6.0, 5.5, 5.0 and 4.5. The samples were analyzed for specificity by SDS-PAGE.
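One way to picture why a descending pH series can release a histidine-containing peptide from the resin is to estimate how much of the histidine side chain is protonated at each step, since a protonated imidazole group is no longer available as an electron donor. The following Python sketch applies the Henderson-Hasselbalch relationship with an assumed side-chain pKa of 6.0, a textbook approximation for free histidine; the actual value in a given peptide, and in 8 M urea, may differ. This is offered only as a possible explanation, and the function name `fraction_protonated` is hypothetical.

```python
def fraction_protonated(pH, pKa=6.0):
    """Fraction of a titratable group in its protonated form.

    Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    The default pKa of 6.0 is an ASSUMED approximate value for the
    histidine side chain.
    """
    return 1.0 / (1.0 + 10 ** (pH - pKa))


# pH steps of the sequential elution described in Example 6.
PH_STEPS = [7.5, 7.0, 6.5, 6.0, 5.5, 5.0, 4.5]

if __name__ == "__main__":
    for pH in PH_STEPS:
        print(f"pH {pH:.1f}: {fraction_protonated(pH):.2f} protonated")
```

Under this assumption the protonated fraction rises steadily as the pH is lowered, and is one half at pH 6.0.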

1. A polypeptide, protein or protein fragment represented by the formula R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)-Sp₂-R₂, wherein (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) is a metal ion-affinity peptide, R₁ is hydrogen, a polypeptide, protein or protein fragment, Sp₁ is a covalent bond or a spacer comprising at least one amino acid residue, R₂ is hydrogen, a polypeptide, protein or protein fragment, Sp₂ is a covalent bond or a spacer comprising at least one amino acid residue, Z₁ is an amino acid residue selected from the group consisting of Ala, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val, and Z₂ is an amino acid residue selected from the group consisting of Ala, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val.
2. The polypeptide, protein or protein fragment of claim 1, wherein Z₁ is selected from the group consisting of Ala, Asn, Ile, Lys, Phe, Ser, Thr, and Val, and Z₂ is selected from the group consisting of Ala, Asn, Gly, Lys, Ser, Thr and Tyr.
3. The polypeptide, protein or protein fragment of claim 1, wherein Z₁ and Z₂ are selected from the group consisting of: (a) Z₁ is Asn and Z₂ is Gly; (b) Z₁ is Asn and Z₂ is Lys; (c) Z₁ is Lys and Z₂ is Gly; (d) Z₁ is Lys and Z₂ is Lys; (e) Z₁ is Ile and Z₂ is Asn; (f) Z₁ is Thr and Z₂ is Ser; (g) Z₁ is Ser and Z₂ is Tyr; (h) Z₁ is Val and Z₂ is Ala; and (i) Z₁ is Ala and Z₂ is Lys.
4. The polypeptide, protein or protein fragment of claim 1, wherein R₁ or R₂ is hydrogen.
5. The polypeptide, protein or protein fragment of claim 1, wherein R₁ or R₂ is an amino acid residue.
6. The polypeptide, protein or protein fragment of claim 1, wherein Sp₁ or Sp₂ is a spacer comprising a proteolytic cleavage site, a fusion protein, a secretion sequence, a leader sequence for cellular targeting, an antibody epitope or an internal ribosomal sequence.
7. The polypeptide, protein or protein fragment of claim 1, wherein Sp₁ or Sp₂ is a spacer comprising a proteolytic cleavage site.
8. The polypeptide, protein or protein fragment of claim 7, wherein the proteolytic cleavage site is cleaved with enterokinase.
9. The polypeptide, protein or protein fragment of claim 1, wherein any one of Sp₁, Sp₂, R₁ and R₂ comprises at least one of the amino acid sequences selected from the group consisting of SEQ ID NOS: 1-17.
10. The polypeptide, protein or protein fragment of claim 1, wherein Sp₁ or Sp₂ is a spacer comprising the enzyme glutathione-S-transferase of the parasite helminth Schistosoma japonicum.
11. The polypeptide, protein or protein fragment of claim 1, wherein Sp₁ or Sp₂ is a spacer comprising the amino acid sequence DYKDDDDK (SEQ ID NO: 15).
12. The polypeptide, protein or protein fragment of claim 1, wherein Sp₁ or Sp₂ is a spacer comprising the amino acid sequence DLYDDDDK (SEQ ID NO: 16).
13. The polypeptide, protein or protein fragment of claim 1, wherein Sp₁ or Sp₂ is a spacer comprising the amino acid sequence Met-Asp-Tyr-Lys-Asp-His-Asp-Gly-Asp-Tyr-Lys-Asp-His-Asp-Ile-Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 17).
14. The polypeptide, protein or protein fragment of claim 1, wherein Sp₁ or Sp₂ is a spacer comprising at least one amino acid residue, said spacer comprising an antigenic domain, wherein the antigenic domain comprises the sequence (SEQ ID NO: 39) X²⁰-(X¹-Y-K-X²-X³-D-X⁴)_(n)-X⁵-(X¹-Y-K-X⁷-X⁸-D-X⁹-K)-X²¹

where: D, Y and K are their representative amino acids; X²⁰ and X²¹ are independently a hydrogen or a covalent bond; each X¹ and X⁴ is independently a covalent bond or at least one amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; each X², X³, X⁷ and X⁸ is independently an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; X⁵ is a covalent bond or a spacer domain, the spacer domain comprising at least one amino acid or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)-, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val; X⁹ is a covalent bond or an aspartate residue; and n is 0, 1 or 2.
15. The polypeptide, protein or protein fragment of claim 1, wherein Sp₁ or Sp₂ is a spacer comprising at least one amino acid residue, said spacer comprising an antigenic domain, wherein the antigenic domain comprises the sequence (SEQ ID NO: 40) X²⁰-(D-Y-K-X²-X³-D)_(n)-X⁵-(D-Y-K-X⁷-X⁸-D-X⁹-K)-X²¹

where: D, Y, K are their representative amino acids; X²⁰ and X²¹ are independently a hydrogen or a covalent bond; each X², X³, X⁷ and X⁸ is independently an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; X⁵ is a covalent bond or a spacer domain, the spacer domain comprising at least one amino acid or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)-, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val; X⁹ is a covalent bond or an aspartate residue; and n is at least 2.
16. The polypeptide, protein or protein fragment of claim 1, wherein Sp₁ or Sp₂ is a spacer comprising at least one amino acid residue, said spacer comprising an antigenic domain, wherein the antigenic domain comprises the sequence (SEQ ID NO: 45) X²⁰-X¹⁰-(D-Y-K-X²-X³-D)_(n)-X⁵-(D-Y-K-X⁷-X⁸-D-X⁹-K)-X²¹

where: D, Y, and K are their representative amino acids; X²⁰ and X²¹ are independently a hydrogen or a covalent bond; X¹⁰ is a covalent bond or an amino acid; each X², X³, X⁷ and X⁸ is independently an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; X⁵ is a covalent bond or a spacer domain, the spacer domain comprising at least one amino acid or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)-, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, and Val; X⁹ is a covalent bond or an aspartate residue; and n is at least 2.
17. The polypeptide, protein or protein fragment of claim 1, wherein Sp₁ or Sp₂ is a spacer comprising at least one amino acid residue, said spacer comprising an antigenic domain, wherein the antigenic domain comprises the sequence (SEQ ID NO: 42) X²⁰-(D-X¹¹-Y-X¹²-X¹³)_(n)-X¹⁴-(D-X¹¹-Y-X¹²-X¹³-D-X¹⁵-K)-X²¹

where: D, Y and K are their representative amino acids; X²⁰ and X²¹ are independently a hydrogen or a covalent bond; each X¹¹ is a covalent bond or an amino acid; each X¹² is an amino acid selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; each X¹³ is a covalent bond or at least one amino acid selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; X¹⁴ is a covalent bond or a spacer domain, the spacer domain comprising at least one amino acid or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)-, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val; X¹⁵ is a covalent bond or an aspartate residue; and n is 0 or at least 1.
18. A polypeptide, protein or protein fragment represented by the formula R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)_(t)-Sp₂-R₂, wherein (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) is a metal ion-affinity peptide, t is at least 2, R₁ is hydrogen, a polypeptide, protein or protein fragment, Sp₁ is a covalent bond or a spacer comprising at least one amino acid residue, R₂ is hydrogen, a polypeptide, protein or protein fragment, Sp₂ is a covalent bond or a spacer comprising at least one amino acid residue, Z₁ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val, and Z₂ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val.
19. The peptide of claim 18, wherein Z₁ and Z₂ are selected from the group consisting of: (a) Z₁ is Asn and Z₂ is Lys; and (b) Z₁ is Lys and Z₂ is Gly.
20. A polypeptide, protein or protein fragment represented by the formula R₁-Sp₁-[(His-Z₁-His-Arg-His-Z₂-His)-Sp₂]_(t)-R₂, wherein (His-Z₁-His-Arg-His-Z₂-His) (SEQ ID NO: 24) is a metal ion-affinity peptide, t is at least 2, R₁ is hydrogen, a polypeptide, protein or protein fragment, Sp₁ is a covalent bond or a spacer comprising at least one amino acid residue, R₂ is hydrogen, a polypeptide, protein or protein fragment, Sp₂ is a covalent bond or a spacer comprising at least one amino acid residue, Z₁ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val, and Z₂ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val; and each Sp₂ of the recombinant polypeptides, proteins or protein fragments may be the same or different.
21. The peptide of claim 20, wherein Z₁ and Z₂ are selected from the group consisting of: (a) Z₁ is Asn and Z₂ is Lys; and (b) Z₁ is Lys and Z₂ is Gly.